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Abstract 


This  work  investigated  the  feasibility  of  solving  Moving  Horizon  Esti- 
mation(MHE)  problems  in  hardware,  e.g.  Field  Programmable  Gate  Ar- 
rays(FPGA),  in  contrast  to  using  software.  With  the  hardware  approach 
we  could  build  customized  computational  machines  for  specific  algorithms 
and  hence  gain  computational  and  power  efficiency,  which  is  crucial  for  em¬ 
bedded  real-time  applications  where  computational  resources  are  limited.  A 
hardware  approach  enables  one  to  build  certifiable  components  and  is  also 
less  susceptable  to  software  virus  attack.  Using  high  level  synthesis  tools 
instead  of  low  level  hardware  description  languages,  we  demonstrated  that 
it  is  feasible  to  routinely  deploy  customised,  sophisticated  computational  al¬ 
gorithms  such  as  MHE,  on  reconfigurable  hardware  for  real-time  embedded 
applications. 

In  this  report,  we  first  reviewed  one  formulation  of  the  MHE  estimation 
which  is  amenable  for  both  linear  and  non-linear  systems.  Numerical  simu¬ 
lations,  including  a  radar  tracking  problem,  were  performed  to  compare  its 
performance  against  traditional  Kalman  Filter  and  Extended  Kalman  Filter 
(EKF).  One  advantage  of  the  MHE  approach  is  that  prior  information,  such 
as  constraints  that  exist  in  applications  can  be  incorporated  in  the  design. 
The  results  showed  that  the  incorporation  of  constraints  in  the  MHE  ap¬ 
proach  outperformed  both  the  Kalman  Filter  and  EKF  when  there  is  poor 
prior  knowledge  of  the  process  and  measurement  noises. 

MHE  is  an  online  optimization-based  strategy  which  can  be  formulated 
as  a  Quadratic  Program(QP),  which  then  needs  to  be  solved  at  every  sam¬ 
pling  instance.  This  can  be  computationally  more  expensive  then  the  tra¬ 
ditional  Kalman  Filter(KF)  where  recursive  solution  is  available.  We  devel¬ 
oped  various  MHE  designs  and  implemented  them  on  the  Xilinx  Zynq  ZC702 
FPGA  board.  The  results  suggest  that,  for  modest  size  MHE  problems,  a 
hardware  solution  could  outperform  software  solution,  with  very  attractive 
speed,  clock  and  power  efficiency.  Instead  of  low  level  hardware  description 
language,  we  used  a  high-level  synthesis  tool  (Vivado  from  Xilinx)  for  proto¬ 
typing  so  that  the  design  can  be  easily  customized  for  different  applications 
and  for  design  space  explorations,  e.g.  trading-off  power,  hardware  resource 
and  computational  speed.  We  concluded  that  it  is  feasible  to  routinely  de¬ 
ploy  the  “MHE  on  a  chip”  technology  for  modest  size  MHE  problems  in 
embedded  and  real-time  applications  where  power  and  computational  re¬ 
sources  are  limited. 
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1  Introduction 


The  aim  of  this  work  is  to  investigate  the  feasibility  of  solving  Moving  Hori¬ 
zon  Estimation(MHE)  problems  in  hardware,  e.g.  Field  Programmable  Gate 
Arrays(FPGA),  in  contrast  to  using  software.  With  the  hardware  approach 
we  could  build  customized  computational  machines  for  specific  algorithms 
and  hence  gain  computational  and  power  efficiency,  which  is  crucial  for  em¬ 
bedded  real-time  applications  where  computational  resources  are  limited. 
A  hardware  approach  enables  one  to  build  certifiable  components  and  is 
also  less  susceptable  to  software  virus  attack.  The  ultimate  aim  of  this 
work  is  to  demonstrate  that,  by  using  high  level  synthesis  tools  instead  of 
low  level  hardware  description  languages,  it  is  feasible  to  routinely  deploy 
customised,  sophisticated  computational  algorithms  such  as  MHE,  on  re- 
configurable  hardware  for  real-time  and  embedded  applications. 

One  paper  was  published1  and  another  is  currently  under  review2.  Both 
papers  are  included  in  this  report.  The  appendex  contains  the  C  code  and 
photos  of  the  experimental  set  up. 

2  Results  and  Discussions 

2.1  MHE  is  more  robust  than  KF  or  EKF  against  inaccurate 
noise  statistics 

MHE  is  an  online  optimization-based  strategy,  unlike  Kalman  Filter  (KF) 
whose  design  depends  crucially  on  a  good  knowledge  of  the  statistical  char¬ 
acteristics  of  the  noise  in  the  system.  In  other  words,  KF’s  performance  is 
sensitive  to  the  accurate  assumption  of  the  noise  statistics.  Hence,  careful 
tuning  of  KF  is  usually  needed  when  deploying  KF  in  real,  practical  applica¬ 
tions.  One  advantage  of  the  MHE  approach  is  that  prior  information,  such 
as  constraints  that  exist  in  applications  can  be  incorporated  in  the  design. 

In  the  first  paper  of  this  report,  we  reviewed  one  formulation  of  the  MHE 
estimation  which  is  amenable  for  both  linear  and  non-linear  systems.  Nu¬ 
merical  simulations,  including  a  radar  tracking  problem,  were  performed  to 
compare  its  performance  against  traditional  Kalman  Filter  and  Extended 
Kalman  Filter  (EKF).  The  results  showed  that  the  MHE  approach  outper¬ 
formed  both  the  Kalman  Filter  and  EKF  when  there  is  poor  prior  knowledge 
of  the  process  and  measurement  noises.  This  confirmed  our  intuition  that, 
by  including  constraints  in  the  formulation,  MHE  has  the  advantage  over 

:D.  Zhou,  K.V.  Ling  and  E.K.  Poll,  Constrained  Moving  Horizon  Estimation  for  Lin¬ 
ear  and  Nonlinear  Systems,  3rd  SONDRA  Workshop,  10-14  June  2013,  Hyeres,  France, 
pp. 188-191. 

2T.  V.  Dang  and  K.V. Ling,  Moving  Horizon  Estimation  on  a  Chip,  submitted  to  In¬ 
ternational  Conference  on  Control,  Automation,  Robotics  and  Vision  (ICARCV),  10-12 
December  2014,  Singapore 
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KF  and  EKF  in  the  sense  that  the  design  is  less  sensitive  to  wrong  choices 
of  Q  and  R  (process  and  measurement  noise  variances  needed  in  KF  and 
EKF  designs) .  Robustness  of  the  design  against  inaccurate  prior  knowledge 
(e.g.  process  and  measurement  noise  variances)  is  important  for  practical 
deployment  of  the  MHE  technology. 

2.2  Implementation  of  MHE  on  FPGA 

The  next  paper  demonstrated  the  feasibility  of  implementing  MHE  algo¬ 
rithm  on  hardware.  In  a  larger  context,  it  demonstrated  that,  by  using  high 
level  synthesis  tools  instead  of  low  level  hardware  description  languages,  it 
is  feasible  to  routinely  deploy  customised,  sophisticated  computational  algo¬ 
rithms  such  as  MHE,  on  reconfigurable  hardware  for  real-time  and  embedded 
applications. 

MHE  can  be  formulated  as  a  Quadratic  Program(QP),  which  then  needs 
to  be  solved  at  every  sampling  instance.  Interior  Point  Method(IPM)  and 
Active  Set  Method(ASM)  are  two  popular  algorithms  for  solving  QP  prob¬ 
lems.  In  this  work,  however,  we  used  Alternating  Direction  Method  of  Mul¬ 
tipliers  (ADMM)  which  is  a  first  order  method.  ADMM  has  the  advantage 
that  it  has  a  simpler  computational  structure  and  hence  more  suitable  to 
exploit  the  computational  resources  available  in  FPGA  for  parallel  compu¬ 
tation,  although  ADMM  takes  more  iteration  steps  to  converge  than  second 
order  methods  such  as  IPM. 

Instead  of  implementing  a  generic  QP  solver,  we  specialized  to  MHE  with 
box-type  constraints  (a  commonly  encountered  constraint  type  in  practical 
applications),  so  that  the  ADMM  iterations  become  much  simplified  and 
involve  mainly  dot-product  computations  This  has  the  potential  of  achieving 
a  division-less  algorithm,  leading  to  fixed-point  only  computation;  division 
and  floating  point  computations  are  especially  time  and  resource  expensive 
in  hardware.  Instead  of  low  level  hardware  description  language,  we  used 
a  high-level  synthesis  tool  (Vivado  from  Xilinx)  for  prototyping  so  that  the 
design  can  be  easily  customized  for  different  applications  and  for  design  space 
explorations,  e.g.  trading-off  power,  hardware  resource  and  computational 
speed. 

We  developed  various  MHE  designs,  compared  floating  point  and  fixed 
point  implementations,  with  and  without  parallelism  (loop  pipelining),  on 
a  Xilinx  Zynq  ZC702  FPGA  board.  These  designs  clocked  at  50MHz.  The 
results  suggest  that,  for  modest  size  MHE  problems,  a  hardware  solution 
could  outperform  software  solution,  with  very  attractive  speed,  clock  and 
power  efficiency.  It  is  thus  feasible  to  deploy  the  “MHE  on  a  chip”  technol¬ 
ogy  for  modest  size  MHE  problems  in  embedded  and  real-time  applications 
where  power  and  computational  resources  are  limited. 
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Constrained  Moving  Horizon  Estimation  for  Linear  and  Nonlinear 

Systems 

Dexiang  Zhou,  K.V.  LING  and  E.K.  Poh 


Abstract — Moving  Horizon  Estimation  (MHE)  is  formulated 
as  a  constrained  optimization  problem  using  available  measure¬ 
ments  over  a  moving  window  interval.  This  approach  alleviates 
the  computational  complexity  and  naturally  incorporate  prior 
information  in  the  form  of  inequality  constraints  on  state  and 
noise.  In  this  paper,  we  review  one  formulation  of  the  MHE 
estimation  which  is  amenable  for  both  linear  and  non-linear 
systems.  Numerical  simulations,  including  a  radar  tracking 
problem,  are  performed  to  compare  its  performance  against 
traditional  Kalman  Filter  and  Extended  Kalman  Filter  (EKF). 
The  results  show  that  the  MHE  approach  outperforms  both 
filters  when  there  is  poor  prior  knowledge  of  the  process  and 
measurement  noises. 

I.  Introduction 

T  is  well  known  that  the  Kalman  Filter  is  equivalent  to 
least-squares  estimation  when  the  system  is  linear  and  the 
process  and  measurement  noises  are  mutually  independent 
Gaussian  noises  [1],  Analytic  or  recursive  solutions  are 
available  for  the  least-square  estimation  or  Kalman  Filter 
when  constraints  are  not  considered.  In  applications,  limits 
or  constraints  on  state  variables  are  often  known  a  priori 
and  can  be  expressed  as  inequalities.  For  example  tempera¬ 
ture,  pressure,  flow  rates,  must  be  non-negative  and  cannot 
go  above  some  upper  bound.  However,  with  inequalities 
constraints,  the  estimation  of  state  becomes  a  constrained 
optimisation  problem  where  no  analytical  solution  may  exist. 
With  increasing  time  horizon,  the  constrained  optimisation 
problem  becomes  computationally  intractable,  thus  the  idea 
of  Moving  Horizon  Estimation  (MHE)  is  used  to  limit  the 
number  of  decision  variables.  MHE  is  a  kind  of  optimization- 
based  state  estimation  in  which  an  estimation  cost  function 
is  optimized  and  the  measurements  in  a  moving  window  are 
used  to  estimate  the  state.  The  estimation  cost  function  often 
includes  the  error  between  true  and  estimation  of  the  state, 
the  error  between  measurement  and  predicted  output  and  the 
arrival  cost  which  summaries  the  past  measurements  outside 
of  the  moving  window. 

The  relationship  between  full  information  and  receding 
horizon  estimation  is  investigated  in  [2].  A  moving  horizon- 
based  approach  for  least-squares  estimation  was  proposed 
to  solve  the  issue  of  computational  intractable  when  the 
constraints  are  taken  into  account  [3],  The  duality  between 
control  and  estimation  is  well  known  in  the  context  of 

This  work  was  supported  by  a  grant  from  Asian  Office  of  Aerospace 
Research  and  Development  (Ref: AOARF- 124004).  Dexiang  Zhou,  K.V. 
LING  and  E.K.  Poh  are  with  Division  of  Control  and  Instrument,  School 
of  Electrical  and  Electronic  Engineering,  Nanyang  Technological  Uni¬ 
versity,  Singapore,  639798,  Singapore,  (e-mail:  ZHOU0180@e.ntu.edu.sg; 
EKVLING@ntu.edu. sg;  eekpoh@ntu.edu.sg  ) 


Linear  Quadratic  Regulator  and  the  Kalman  Filter.  For 
constrained  control  and  estimation  their  Lagrangian  duality 
is  investigated  in  [4].  Constrained  MHE  for  linear  system 
and  the  conditions  of  nominal  asymptotically  stability  are 
investigated  in  [5].  General  constrained  MHE  for  nonlinear 
system  is  proposed  in  [6]. 

Constrained  MHE  uses  the  constraints  (often  in  the  form 
of  inequalities  of  states  and  noise)  to  prevent  the  estimator 
from  producing  values  which  are  not  reasonable  (i.e.  the 
values  which  violate  the  constraints).  Thus  good  estimation 
performance  is  often  obtained  using  constrained  MHE.  In 
this  paper  constrained  MHE  for  linear  and  nonlinear  systems 
state  estimation  is  investigated.  The  purpose  of  the  paper  is 
to  demonstrate  that  the  constrained  MHE  is  more  robust  than 
Kalman  Filter  and  EKF  when  uncertainty  exists,  in  particular, 
when  we  have  poor  prior  knowledge  about  the  process  and 
measurement  noises. 

This  paper  is  organized  as  follows.  In  Section  2  we 
introduce  the  constrained  MHE  which  is  suitable  for  both 
linear  and  non-linear  systems.  In  Section  3  approximations 
of  arrival  costs  of  constrained  MHE  for  linear  and  nonlin¬ 
ear  systems  are  introduced.  In  Section  4  the  advantage  of 
constrained  MHE  is  shown  comparing  to  Kalman  Filter  and 
EKF  using  simulations.  Section  5  concludes  this  paper. 


II.  Constrained  MHE 

Consider  the  following  system 

xk+i  =  f  [k,xk]  +  uk 

yk  =  h  [k,  xk\  +  ek  (1) 

The  process  noise  ojk  and  measurement  noise  ek  are  assumed 
additive,  zero  mean  and  white  noises.  The  initial  state,  with 
estimate  Xq,  an  approximation  mean,  and  the  associated 
covariance  matrix  Pq  which  is  positive  definite,  is  assumed 
to  be  uncorrelated  with  the  two  noise  sequences. 

We  assume  the  system  (1)  are  subject  to  the  following 
constraints: 

Xk  (E  1^x5  C  Ho;  5  €k  (E  (2) 

where  and  il,.  are  polytopic  sets  containing  the 

origin. 

We  formulate  a  constrained  state  estimation  problem  as 
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the  follows: 

Jt  '■  &t  =  min  <Pt 

So,{<t>fc}fc=0 

s.t.  Xk+1  =  f  [k,Xk\  +  Uk, 
yk  =  h  [k,xk\  +  ek, 

Xk  €  bJk  £  Hu j,  &k  £  He.  (3) 

where 

T 

&T  =  53  +  ^kQ^^k) 

k= 0 

+  (x0  -  x0)  P^txo  -  x0)  (4) 

and  {oJk}k=o  denotes  the  sequence  {w0,  wi,  •••  ,  wt}-  Q 
and  R  are  the  design  choices  of  variances  of  the  process 
and  measurement  noises.  The  solution  of  Jt  at  time  T  is 
x°]T  and  {u)k\T}k=o-  The  current  estimation  x^T  is  the  state 
evolution  of  the  system  (1)  driven  by  the  {^k\T}k=o  from 
the  initial  state  Xq\t- 

The  above  estimation  is  full  information  estimation  with¬ 
out  dropping  old  measurements.  This  optimization  problem 
becomes  computational  intractable  as  T  increases.  MHE 
removes  this  difficulty  by  considering  only  the  most  recent  N 
measurements  {yk}r-N+ i  to  estimate  states  {x^t-n+v 
We  reformulate  the  object  function  (4)  by  breaking  the  time 
interval  into  two  pieces: 

T 

&T  =  ^2 

k=T-N+ 1 
T-N 

( &kR  i&k+&kQ  i&ksj 

k—O 

+  (x0  -  x0)  Po\x0  -  Xo)  (5) 

Because  we  use  the  state  space  model  of  the  system  the 
quality  of  the  first  term  of  (5)  only  depends  on  Xt-n+i 
and  sequences  {uk}k=T~N+i>  {4}Lt-at+i-  bY  usin8  for¬ 
ward  dynamic  programming  the  full  information  estimation 
(3)  with  the  cost  function  (5)  can  be  reformulated  as  the 
following  equivalence  with  moving  horizon  window: 

T 

Jt  ■  J>t  =  min  (efcf?_1efc 

S'Z’-JV  +  lT't’fclfc—T-N+l  k_T—N+l 

+  Q}kQ~1(bk)  +  9t-n+i(xt-n+i) 
s.t.  xk+i  =  f[k,xk\ 

Vk  =  h  [k,  bjk]  +  ek, 

Xk  ,  (jJk  £  ,  &k  £  He  (6) 

where 

Ot(z)  =  min  i  ■ 

*o,{Ok}Z~Z 

x(T;x0,{cbk}lZo)  =  z}  (7) 

where  the  minimization  is  subject  to  the  constraints  (2)  and 
x(T :  xo,  (wfelfc^o1)  denotes  the  state  evolution  of  the  system 
(1)  when  the  initial  sate  is  5:+  and  the  process  noise  is 

{UkYkZl  11  follows  that  Mxo)  =  (xq  —  Xoj  T’o”1  (io  —  Xo). 


The  arrival  cost  9t-n+i{xt-n+i )  denotes  the  contribution 
of  the  measurements  {yk}k=o~N  on  die  estimation  of  state 
xt-n+i  and  allows  us  to  formulate  the  full  information  esti¬ 
mation  as  MHE.  For  the  majority  of  systems  with  constraints, 
we  can  not  obtain  analytical  expression  for  the  arrival  cost. 
Because  full  information  estimation  has  very  good  theoret¬ 
ical  properties  in  terms  of  stability  and  optimality  [7]  we 
formulate  MHE  as  close  as  possible  to  the  full  information 
estimation  by  designing  a  suitable  arrival  cost. 

III.  Approximation  of  the  Arrival  Cost 

One  reasonable  solution  is  to  approximate  the  arrival 
cost  for  constrained  estimation  problem  with  the  arrival 
cost  for  the  unconstrained  estimation  problem.  Here  two 
approximations  of  the  arrival  cost  for  linear  system  and 
nonlinear  system  are  introduced  respectively. 

For  linear  system  it  is  assumed  that  the  functions  /  [k,  xk] 
and  h  [/c,  xk]  are  defined  by 

f  [kj  Xk\  .  AkXkl  h  [k7  Xk\  ■  C kXk • 

The  unconstrained  full  information  estimation  is  known  as 
Kalman  Filter.  Kalman  filtering  covariance  update  formula 
is 

Pk  =  Q  +  ^4fc-i-Pfc-i^4fc_i 

-  Ak-iPk-iC'k(R  +  CkPk-iC'^CkPk-iA^  (8) 
with  initial  condition  Pq  and  its  arrival  cost  is  expressed  as 

Ok(z)  =  {z-  x0k\k-i)'PkHz  -  x°k\k-\)  +  &k- i  (9) 

In  constrained  MHE  for  linear  system  the  arrival  cost  is 
approximated  by  the  arrival  cost  of  Kalman  Filter  (i.e.  (9)). 

For  nonlinear  system  one  strategy  to  approximate  the 
arrival  cost  9k{-)  is  to  employ  the  first-order  Taylor  se¬ 
ries  approximation  of  the  system  (1)  around  the  estimated 
trajectory.  This  strategy  yields  the  Extended  Kalman  Filter 
covariance  update  formula.  Suppose  the  model  functions 
/  [k,  xk]  and  h  [. k ,  xk\  are  sufficiently  smooth.  Then  the 
arrival  cost  is  approximated  as 

Ok(z)  =  (Z  -  x^yp^iz  -  x%™)  +  3*k_ r  (10) 

where  P3  sequence  is  obtained  by  solving  the  matrix  Riccati 
equation  (8)  in  which 

4  df(k)  I  r  dh(k)  I 

k  =  Ck  = 

and  the  superscript  MHE  means  that  the  state  estimation  is 
obtained  by  MHE  . 

For  T  >  N  we  formulate  the  implementation  of  con¬ 
strained  MHE  as  the  solution  to  the  following  least-squares 
program  which  approximates  to  the  estimation  problem  Jt'. 

Jt  '■  $ t  =  min  <Ixt 

XT—N+  lJ<Z^y^=T—N  +  l 

s.t.  Xk+1  =  f  [k,Xk]  +  u>k, 

yk  =  h  [k,  xk]  +  ek, 

Xk  €  Qx ,  ^k  ^  ,  &k  €  (11) 
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where  the  objective  function  is  defined  as 

T 

@T  =  ^2  (efc-R_1efc  +  tu,kQ-1LJkS) 

k=T—N+l 

+  9t-n+i{£t-n+i)  (12) 

Note  that  the  arrival  cost  is  approximated  as  (10)  including 
the  linear  and  nonlinear  cases. 

IV.  Illustrate  Example 

In  this  section  we  show,  with  simulation  examples,  the 
advantage  of  constrained  MHE  comparing  to  Kalman  Filter 
and  EKF  when  the  process  and  measurement  noise  variances 
are  not  chosen  correctly.  In  the  following  simulations  it  is 
assumed  that  process  and  measurement  noises  are  white 
Gaussian  noises.  And  we  denote  the  ith  element  of  a  vector 
z  in  time  instant  k  as  Zitk- 


Fig.  1.  Estimation  performance  with  Q  =  10  x  1 4  and  R  =  5 


and 

E  )  =  0.1  x  /4,  E  (ekel)  =  5. 


TABLE  I 

Performance  Comparison  of  Kalman  Filter  and  MHE 


Kalman  Filter  results  is  an  optimal  estimation  for  linear 
system  when  Q  and  R  are  chosen  to  match  process  and 
measurement  noises.  However,  the  variances  of  process  and 
measurement  noises  are  often  not  known  exactly  in  practice 
and  Q  and  R  are  often  used  as  tuning  parameters  of  the 
Kalman  filter. 

Figure  1  and  2  show  that  the  Kalman  Filter  could  produce 
estimates  that  violate  the  state  constraints  when  Q  and  R  are 
not  chosen  correctly  while  the  constrained  MHE  gives  state 
estimates  that  are  within  the  constraints.  The  reason  is  that 
the  prior  information  of  constraints  are  taken  into  account  in 
the  constrained  MHE  while  Kalman  Filter  does  not. 

Also  in  Table  I  we  use  the  criterion: 

T  T 

Jkf  =  y~l(si,fc  -  x°  k\k)2,  Jmhe  =  y~l(Ei,fc  -  x™k\k)2 

k= 1  k— 1 

to  compare  the  estimation  performance  of  Kalman  Filter  and 
MHE  when  the  variances  of  process  and  measurement  noises 


Q  =  ql±  with  q 

R 

Kalman  Filter 

MHE  wih  N  =  5 

0.1 

5 

13.9794 

13.9794 

10 

5 

166.6161 

37.1588 

10 

1 

194.8563 

37.5834 

100 

1 

203.5847 

37.7831 

are  not  chosen  correctly.  The  correct  values  are  q  =  0.1  and 
R  =  5  (row  1).  When  this  is  the  case,  Kalman  Filter  gives 
the  optimal  estimates  and  the  performance  of  Kalman  Filter 
and  constrained  MHE  is  almost  the  same.  The  performance 
of  Kalman  Filter  deteriorates  significantly  when  the  values 
of  q  and  R  are  inaccurate  (rows  2  to  4).  The  constrained 
MHE,  however,  is  less  sensitive  to  the  values  of  q  or  R. 
Thus  constrained  MHE  is  robust  to  the  uncertain  information, 
in  particular  to  the  variances  of  process  and  measurement 
noises. 
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B.  Performance  Comparison  of  EKF  and  Constrained  MHE 
Consider  a  radar  tracking  application 


Xk-\- 1  — 


Vk  = 


I  1  0  O' 
0  10  0 
0  0  11 
0  0  0  1 


Xk  +  Wfc 


1  ,k  '  3,k 

arctan 

Xl  ,k  J 


where  uik  and  e*.  are  zero  means  Gaussian  noise  with 
variance 


E  (ukul) 


"1  0  0  0  ' 

0  0.01  0  0 

0  0  10 
0  0  0  0.01 


E  {e-kel) 


The  constraints  are 


9  0 
0  1 


Fig.  4.  Estimation  of  Velocities  using  EKF  and  constrained  MHE  when 
Q  and  R  are  inaccurate 


140  ^  %2,k->  %4,k  ^  210. 


V.  Conclusion 


The  initial  state  is  xo  —  [0  200  0  200]'  and  we  chose 
P0  =  h,  N  =  40. 

Figure  3  shows  that  the  performance  of  EKF  and  con¬ 
strained  MHE  are  almost  the  same  when  Q  and  R  are 
chosen  correctly.  Constrained  MHE  is  able  to  keep  the  state 
estimates  within  the  constraints  while  the  EKF  is  unable  to 
do  so  shown  in  Figure  4  when  the  values  of  Q  and  R  are 
not  chosen  correctly.  The  inaccurate  values  of  Q  and  R  used 


0.09  0 


in  Figure  4  are 


Q  = 


1  0  0  0' 
0  10  0  0 
0  0  10 
0  0  0  1 


In  this  paper,  we  review  the  implementation  of  constrained 
MHE  which  is  applicable  to  both  linear  and  nonlinear  sys¬ 
tems.  The  constrained  MHE  is  formulated  where  the  arrival 
cost  is  approximated  by  the  solution  to  a  Kalman  Filter 
for  a  linear  estimation  problem  and  by  EKF  for  nonlinear 
estimation.  Two  illustrative  examples,  including  one  on  a 
typical  non-linear  radar  tracking  problem,  were  studied  and 
the  simulation  results  demonstrate  that  the  constrained  MHE 
exhibits  robust  performance  when  there  is  poor  prior  knowl¬ 
edge  of  the  process  and  measurement  noises  as  compared 
to  stand-alone  Kalman  Filter  and  EKF.  Our  future  work  will 
consider  deploying  the  proposed  MHE  strategy  for  real-time 
applications  where  computational  resources  are  limited.  We 
aim  to  devise  algorithms  that  could  exploit  the  structure 
of  the  specific  problems  and  types  of  nonlinearity  to  gain 
computational  advantage. 


Fig.  3.  Estimation  of  Velocities  using  EKF  and  constrained  MHE  when 
Q  and  R  are  chosen  correctly 
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Abstract — Second  order  Quadratic  Programming  (QP)  solvers 
such  as  interior-point  method  (IPM)  require  the  solution  of  a 
system  of  linear  equations  at  every  iteration  and  could  be  a  factor 
limiting  the  implementation  of  IPM  to  miniaturized  devices  or 
embedded  systems.  In  contrast,  first  order  QP  solvers  such  as 
alternating  direction  method  of  multipliers  (ADMM)  does  not  re¬ 
quire  the  solution  of  a  system  of  linear  equations.  Thus  first  order 
QP  solver  is  cheaper  and  easier  to  be  implemented  in  embedded 
systems  such  as  FPGA  which  has  limited  hardware  resources.  In 
this  paper  an  FPGA  implementation  of  ADMM  which  solves  QP 
problems  arising  from  Moving  Horizon  Estimation  is  proposed 
to  demonstrate  the  “MHE  on  a  Chip”  idea.  Our  design  has  been 
implemented  in  both  fixed-point  and  floating  point  arithmetic  on 
the  Xilinx  Zynq-7000  XC7Z020-1CLG484C  AP  SoC  and  clocks 
at  50  MHz. 

I.  Introduction 

Interior  Point  Method  (IPM)  and  Active  Set  Method  (ASM) 
are  two  commonly  employed  approaches  for  solving  QP 
problems.  In  every  iteration  of  IPM  and  ASM,  the  solution 
of  a  system  of  linear  equations  is  required  and  this  could 
be  a  factor  limiting  the  implementation  IPM  or  ASM  to 
miniaturized  devices  or  embedded  systems.  Recently  first  order 
QP  solvers  receive  significant  interests  [l]-[3]because  of  its 
simple  computational  structure  and  low  cost  implementation. 
In  contrast  to  the  IPM  and  ASM,  first  order  QP  solver  such  as 
alternating  direction  method  of  multipliers  (ADMM)  does  not 
require  the  solution  of  a  system  of  linear  equations  and  this 
makes  first  order  QP  solver  easier  and  cheaper  to  implement 
on  embedded  systems  such  as  Field  Programmable  Gate  Array 
(FPGA). 

Recent  advances  in  reconfigurable  hardware  technology 
have  made  FPGA  becoming  more  popular  in  a  wide  range 
of  embedded  applications.  Implementation  with  FPGA  brings 
a  shorter  design  cycle  and  greater  flexibility  comparing  with 
ASIC.  Moreover,  high  speed  computation  can  be  achieved 
through  parallelization  and  customization  to  meet  timing  re¬ 
quirement. 

There  have  been  several  previous  FPGA  implementations  of 
QP  solvers  [4]— [6] .  In  [6]  QP  solver  based  on  infeasible  IPM 
has  been  implemented  on  FPGA  for  Model  Predictive  Control 
(MPC).  The  comparison  of  implementation  of  IPM  and  ASM 
on  FPGA  for  MPC  was  discussed  in  [7],  In  [8]  the  authors 
proposed  that  the  FPGA  can  provide  substantial  accelerating 
by  exploiting  the  parallelism  inherent  in  Multiplexed  MPC 
(MMPC).  All  these  papers  implemented  the  QP  solvers  which 
requires  the  solution  of  system  of  linear  equations.  Similar  to 


Fast  Gradient  Method,  ADMM  is  also  a  first  order  method  that 
has  simple  computational  structures  which  allow  efficient  par¬ 
allelism.  In  addition,  the  method  are  division-free  making  the 
implementation  efficiently  since  division  is  a  costly  operation 
in  FPGA  design,  in  term  of  resource  and  latency,  comparing 
with  other  operations  such  as  multiplication  or  addition  . 

Using  high-level  synthesis  (HLS)  tools,  such  as  Vivado 
HLS  from  Xilinx  or  Synphony  C  Compiler  from  Synopsys 
[9],  for  implementation  with  FPGA  brings  lots  of  advantages. 
Designing  at  a  higher  abstraction  level  makes  it  possible  to 
handle  the  increasing  complexity  of  embedded  applications. 
For  those  who  do  not  have  the  training  to  work  with  hardware 
description  languages  (HDLs)  to  perform  a  register  transfer 
level  (RTL)  description  of  the  hardware,  HLS  tools  offer  an 
alternative  approach.  HLS  tools  also  facilitate  rapid  proto¬ 
typing.  FPGA  design  and  implementation  by  HLS  not  only 
significantly  saves  time,  but  also  reduces  the  risk  of  mistakes. 
For  the  implementation  of  ADMM-based  QP  Solver  on  FPGA, 
we  used  Vivado  HLS  and  System  Generator  to  generate  the 
bit  stream  from  a  C-based  design. 

MHE  is  a  kind  of  optimization-based  state  estimation 
where  an  on-line  constrained  optimization  is  solved  and  the 
measurements  in  a  moving  window  are  used  to  estimate  the 
state.  In  applications,  limits  or  constraints  on  state  variables  are 
often  known  a  priori  and  can  be  expressed  as  inequalities.  For 
example  temperature,  pressure,  flow  rates,  and  concentration, 
must  be  non-negative  and  cannot  go  above  some  upper  bound. 
Constrained  MHE  can  include  the  prior  information  (i.e. 
constraints),  non-Gaussian  noises  [10]  and  nonlinear  system 
without  linearisation  [11],  The  parallel  computing  architecture 
of  FPGA  can  accelerate  the  computation  to  solve  on-line 
constrained  optimization.  In  this  paper  a  ADMM-based  QP 
solver  which  solves  on-line  constrained  optimization  problems 
is  implemented  in  FPGA  with  the  aim  to  extend  MHE  to 
embedded  applications. 

This  paper  is  organized  as  follows.  In  Section  II  we 
summarise  the  method  of  using  ADMM  to  solve  MHE  for 
linear  system.  The  procedure  for  designing  and  arriving  at  a 
FPGA  implementation  of  ADMM-based  MHE  is  described  in 
Section  III.  Section  IV  discusses  the  approaches  to  accelerate 
computation  for  our  FPGA  designs.  In  Section  V,  an  example 
is  used  to  illustrate  our  designs.  Conclusions  and  future  work 
are  given  in  Section  VI. 
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II.  ADMM  METHOD  FOR  MHE 
A.  QP  formulation  of  MHE 

Consider  the  following  linear  system 
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0  0  In  0 
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xk+i  =  Axk  +  Buk  +  (1) 

Vk  =  Cxk  +  ek  (2) 

where  xk  £  Kn  is  the  state  vector,  yk  £  is  the  measure¬ 
ment.  The  process  noise  uik  and  measurement  noise  ek  are 
assumed  additive,  zero  mean  and  white  noises  with  variances 
Q  and  R  respectively.  The  initial  state,  with  estimate  .cq,  an 
approximation  mean,  and  the  associated  covariance  matrix  Pq 
which  is  positive  definite,  is  assumed  to  be  uncorrelated  with 
the  two  noise  sequences. 

We  assume  that  the  state  and  noise  vectors  in  fl)  are  subject 
to  the  following  constraints: 

Xk  £  ^x>  tUfc  £  (3) 


where  ttx  and  are  polytopic  sets  containing  the  origin. 

Given  the  N  most  recent  measurements  {ykYk=T-N+i  at 
T,  the  MHE  is  defined  as 


min 

&T-N+ 1 , 
{^fc}  k  =  T  -  N  +  l 
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+  o  AT+i  -  At-JV+i||o-i 

Z  ^T-N+l 

s.t.  xk+i  =  Axk  +  Buk  +  Wfc, 

Vk  —  CXk  T  Gfc? 

Xk  £  ^xi  (4) 


Lo  0  0  ■■■  A  -In  0  0  In] 

and 

$  =  -TtA,  (9) 

P”1  n 

rT-N+ 1  u 

0  In  0  R  1 

(0  is  Kronecker  product,  Im  is  denoted  as  the  identity  matrix 
with  to  of  dimension,) 

The  constrained  optimization  problem  (4)  can  be  formulated 
as  following  standard  form: 

min  -zTHz  +  hTz  (10) 

z  2 

s.t.  Gz  =  g,  (11) 

z  £  K,  (defined  as  Eq.  (7))  (12) 

The  size  of  H  is  (2 N  —  l)n  x  (21V  —  1  )n  and  QP  size  is 
defined  as 

QPSize  =  (21V  -  l)n  (13) 

B.  ADMM  for  MHE 

The  ADMM  algorithm  [12]  for  the  MHE  QP  problem  (10) 
is  as  follow  [13]: 

—  Choose  initial  condition  z0,  do ,Tq 

Do 


r  = 


In 

IN®C 


A  = 


where 

Pk.  =Q  +  APk-iA' 

-  APk-\C'{R  +  CPk-iC')-lCPk-iA'  (5) 
Pt-n+i  =Axf_N  +  But-n  (6) 

and  xf_N  *s  the  estimation  of  x  at  T  —  N. 

By  defining: 


1.  0i+1  =  M11(— h  —  Ti  +  pz±)  +  M12g  (14) 

2.  Zi+1  =  7TK(6»i+i  +  1  Ti)  (15) 

P 

3.  ri+1  =  pdi+1  +  Ti  -  pzi+1  (16) 

4-  £  =  H^i+l  —  ||  2 

While(e  >  e0) 

(17) 


z  =  \x. 
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where  Mu  and  M-\  2  are  computed  offline  as 


H  +  pI(2N—  l)n 

GT~ 

-1 

Mu 

M12 

G 

0 

<2 

M22 

with  the  notation  of  projection  operation: 

TtK(tfe)  =  arg  min  ||z-tfe||2  (19) 

In  MHE  application  constraints  are  usually  imposed  to  restrict 
the  minimum  and  maximum  values  of  internal  states  or  noises. 
For  simplicity  of  FPGA  implementation  we  assume  the  con¬ 
straints  Clx  and  are  box  type  constraints,  i.e., 

—  {%k  :  Zmin  ^  Xk  f  ^max}  (20) 

Hqj  —  {uJk  :  Wmin  ^  i Ok  f  tUmax}  (21) 
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Consequence,  K  is  box:  {K  :  zmin  <  z  <  zmax  }. 
where 


1 N  &  *^min 

1 N  ®  ^max 

•^min  — 

liV-1  <8>  ^min 

i  ^max  — 

liV— 1  <8>  ^max 

and  lm  is  the  m  x  1  vector  of  ones.  With  the  box  type 
constraints  assumption  the  solution  (15)  reduces  to 


Zi+ 1  =  max 


L  . 

A  ^mirn 


min 


L 

t  ^maxi 


(22) 


As  a  summary,  MHE  with  box  constraints  using  ADMM 
algorithm  is: 


Algorithm  1 

—  Compute  Mn,  M12  with  a  p  >  0  as  (18) 
—  Choose  initial  condition  z0,  90  £  K, 
t0  =  0,  e0  :  Stopping  Criterion 
—While  (new  measurement  is  available) 
Compute  h  and  g  as  (8),  M12g 
Do 

1-  ^i+l  =  Mll(— h  —  Ti  +  pz±)  +  Mi2g 

2-  z1+i  = 

max  |zmin,  min  jzmax,  (oi+1  +  ^r± 

3.  7i_i_i  =  pdi  +  i  +  Ti  —  pzi+1 

4.  e  =  ||0i+1  -  6»1 1|2 
while(£  >  £o) 

—  end  While 


(23) 

(24) 

(25) 

(26) 


III.  FPGA  Implementation 

To  implement  a  ADMM-based  QP  solver  for  MHE  on 
FPGA,  we  use  Vivado  HLS,  Xilinx  System  Generator  and 
MATLAB/Simulink  as  development  tools,  and  the  Zynq 
ZC702  board  [14]-[16]. 

Vivado  HLS,  a  high-level  synthesis  tool,  allows  the 
designer  to  describe  the  computational  algorithm  using  a 
high-level  language,  commonly  C/C++/SystemC,  to  generate 
automatically  the  register  transfer  level  (RTL)  description, 
e.g.  VHDL  or  Verilog,  for  FPGA  implementation.  Especially 
for  those  whose  main  expertise  is  in  control  system  design, 
writing  algorithms  in  C  is  easier  and  more  familiar  than 
using  hardware  description  languages.  Therefore,  high-level 
synthesis  tools  accelerate  the  design  process  for  complicated 
algorithms,  and  significantly  reduce  design  time.  The  main 
steps  to  implement  our  ADMM-based  QP  sovler  on  FPGA 
for  MHE  problems  is  shown  in  Figure  1  and  is  described  next. 


Step  1:  C/C++  Coding 

The  MHE  with  box  constraints  using  ADMM  algorithm 
(Algorithm  1)  is  written  in  C  where  the  ADMM  part  is  shown 
in  Figure  2. 


Step  2:  Synthesis  and  RTL  export  with  Vivado  HLS 


e 

Generator 


C/C++  Algorithm 


ADMM-based  QP  Solver 


Datatype:  floating  point 
Performance  optimization 
Loop  unroll.  Pipeline 
RTL  export 


System 
Generator  / 
Simulink 


Bit  stream  generation 
Interface  Configuration 


Hardware  Co-sim  Block  for 
interface  with  FPGA  block 


Bit  stream  for  FPGA 
Configuration 


Fig.  1.  Design  flow  for  FPGA  Implementation 


ADMM_Loop:  do 

t 

//  Eq.  (23) 

Loop_l:  for(i=0ji<size j i++) 

{ 

#pragma  HLS  PIPELINE  //Loop  pipelining  directive 

temp=0j 

Loop_l.l:  for( j=0 j j<sizej j++) 

t 

temp=temp+Mll[i][ j]*(-h[ j]+ro*(zK[j] -uK[i])) j 

> 

xKl[i]=temp+M12g[i] j 
zTemp[i] =xKl[i]+uK[ i] ; 

} 

//  Projection  Eq.  (24) 

Loop_2: 

"  ( 

#pragma  HLS  PIPELINE  //Loop  pipelining  directive 

//Code  omitted  for  briefly 
} 

//Eq  (25)  and  (26),  and  update 
Loop  3: 

{ 

#pragma  HLS  PIPELINE  //Loop  pipelining  directive 

//Code  omitted  for  briefly 

} 

loopCounter=loopCounter+l; 

> 

while(epsilon>epsilon_0) j 


Fig.  2.  ADMM  C  code  in  the  MHE  with  box  constraints 


Vivado  HLS  supports  floating-point  (single  or  double  precision 
IEEE-754  standard  compliance)  or  fixed-point  arithmetic  for 
synthesis.  In  addition,  Vivado  HLS  allows  designers  to 
customize  the  design  by  adding  synthesis  directives/constraint 
such  as  loop-unroll,  pipeline,  resource  constraint,  etc  for 
performance  optimization.  The  design  is  then  synthesized  and 
exported  as  a  RTL  module  with  different  options  for  further 
processing  with  other  Xilinx  tools. 
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Step  3:  Bit  stream  and  Hardware  co-simulation  block 
generation  with  System  Generator 

To  verify  that  the  ADMM  algorithm  is  implemented  correctly, 
the  RTL  module  generated  from  Vivado  HLS  is  imported  to 
System  Generator  (SysGen)  which  is  integrated  with  Simulink. 
SysGen  generates  the  bit-stream  for  FPGA  configuration,  and 
a  Hardware  co-sim  block  for  the  interfacing  between  FPGA 
and  Simulink  environment  [14]. 

Step  4:  Implementation  on  FPGA 

Through  the  hardware  co-sim  block  generated  from  SysGen, 
the  bit  stream  is  downloaded  to  the  FPGA  board.  This 
block  allows  Simulink  and  the  FPGA  board  to  be  connected 
directly  so  that  FPGA-in-the-Loop  simulation  can  be  carried 
out  as  shown  in  Figure  3.  In  this  simulation  scheme,  QP 
problems  of  MHE  at  each  sample  time  are  sent  to  FPGA 
board  via  JTAG  communication  from  Simulink.  After 
completing  the  computation,  FPGA  board  sent  back  the  result 
to  Simulink,  and  wait  for  the  next  computation  task.  The 
interface  setting  between  the  FPGA  board  and  Simulink  is 
defined  by  the  Hardware  Co-sim  block.  In  the  simulation,  the 
FPGA  clocking  mode  is  free  running.  The  synchronization 
mechanism  (Figure  4)  is  as  follow  :  the  Logical  Control 
block  generates  the  Start  signal  periodically  (Ts),  this  signal 
triggers  the  FPGA  starting  operation;  when  FPGA  is  busy 
the  Idle  signal  goes  low,  and  goes  high  when  the  FPGA  has 
completed  the  computation.  Based  on  the  rising  edge  of  the 
Idle  signal,  the  Buffer  capture  the  solution. 


Fig.  3.  Hardware  Co-simulation  with  FPGA 
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n 

1  S 
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J 

l  Xt  ) 

Capture  Capture 


Fig.  4.  Synchronization  mechanism 


IV.  Computational  Acceleration 

Since  our  proposed  ADMM  for  MHE  (Algorithm  1)  is 
division  free,  it  allows  an  efficient  fixed-point  arithmetic  im¬ 
plementation  which  will  run  faster  than  a  floating  point  de¬ 
sign.  Moreover,  ADMM  consist  loops  involving  in  computing 
matrix-vector  product,  loop  pipeline  technique  was  employ  to 
exploit  the  parallelism.  The  efficiency  of  fixed-point  arithmetic 
and  loop  pipelining  are  discussed  in  Section  V. 

A.  Fixed-point  Arithmetic  Implementation 

Comparing  with  floating  point  arithmetic,  the  circuitry  for 
fixed-point  computations  is  much  more  simple  and  faster,  but 
the  dynamic  range  is  limited.  The  design  of  fixed  point  data 
type  is  based  on  the  accuracy  required,  and  the  dynamic  range 
of  the  variable  involving  in  the  algorithm.  Since  there  have  not 
been  any  general  results  on  establishing  an  analytical  bound  for 
ADMM,  it  must  estimate  the  bounds  through  simulation  study 
for  a  particular  QP  problem.  In  our  particular  MHE  example 
shown  in  Section  V,  through  simulation  we  got  the  lower  and 
upper  bound  for  dynamic  variables  (Table  1).  The  fixed  point 
data  type  is  chosen  (Table  I)  to  ensure  the  required  accuracy 
and  dynamic  range.  For  an  accuracy  of  e  =  10-6,  about 
(6(n(10))/(n(2)  =  20  fractional  bits  is  required.  Looking 

at  the  different  range  of  the  variables,  ideally  different  word 
and  fractional  bit  length  fixed-point  representation  should  be 
used.  Since  the  DSP48  block  of  Xilinx  Zynq  ZC702  consists 
of  a  25  x  18  bit  multiplier,  we  chose  25  and  18  bits  long  data 
types  for  variables  involving  in  the  multiplication  operation. 
The  use  of  different  fixed-point  data  type  is  a  subject  of  the 
future  work.  In  order  to  show  the  effectiveness  of  fixed  point 
arithmetic,  we  also  implemented  the  algorithm  in  floating  point 
(single  precision  32  bit  IEEE  754  standard)  for  resouces  and 
timing  comparison  (Section  V). 

TABLE  I 

Fixed  point  designation  (N  =  5,  n  =  4) 


Variable 

Range 

Word 

Fraction 

Mn,Mi2 

(-0.02,0.09) 

25 

19 

z 

(-0.12,4.3) 

25 

19 

e 

(-0.12,4.56) 

25 

19 

T 

(0,1.31) 

25 

19 

h 

(-15.02,0) 

25 

19 

9 

(-1.1) 

25 

19 

ro,  1/ro 

ro  =  4, 1/ro  =  0.25 

18 

10 

Intermediate 

(-10,10) 

25 

19 

B.  Loop  pipelining 

To  exploit  the  parallelism,  loop  pipelining  (LPP)  technique 
was  employed.  The  computation  is  in  a  parallel  manner.  As  a 
result,  this  reduces  the  latency,  but  the  cost  is  more  hardware 
resource  and  hence  chip  area.  To  apply  LPP,  Vivado  HLS 
provides  compiler  directive:  set_directive_pipeline 
<code  section>  or  #  pragma  HLS  pipeline 

[Option]  [14].  In  default  option,  Vivado  HLS 
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automatically  control  the  Initiation  Interval  (II)  which  is 
the  number  of  cycles  between  each  new  execution  of  the  loop 
body,  to  achieve  minimum  latency.  Figure  5  illustrates  LPP 
mechanism.  If  a  loop  need  Q  iterations,  each  iteration  takes 
N  cycles  to  complete,  then  it  take  NO  cycles  to  complete 
(  Figure  5a).  When  employ  LPP,  the  latency  of  the  loop  is 
( Q  —  1)11  +  N  cycles  (  Figure  5b).  The  lower  the  II  is, 
the  smaller  latency  it  achieves.  In  our  design,  three  loops 
of  the  ADMM  algorithm  is  applied  pipeline  directive,  by 
adding  pragma  directive  before  the  code  section  of  the  loop 
as  in  Figure  2.  The  accelerating  effectiveness  depends  on 
the  computational  architecture  and  data  dependencies  of 
the  algorithm.  Since  ADMM  consist  of  mainly  dot-product 
operations,  which  is  simple  computation  structure  and  high 
data  dependence,  so  that  it  allows  efficient  parallelism.  In 
Section  V  we  will  show  the  computation  acceleration  and 
resources  usage  of  ADMM  with  LPP. 

In  summary.  Table  II  shows  the  various  designs  imple¬ 
mented  on  FPGA. 


N  cycles 

\ 

nJTTL . 

_rLnji_n_n_ . ajl 

1  OPl  OP2  1  OP3 

[  OPk  j  OPl  j  OP2  |  OP3  |  OPk  ] 

1st  iteration 

2nd  iteration 

v _ 

Without  Loop  pipelining 

_ / 

1  cycle 

rUTTL 


OP3 


ji_n_n_rLTL . 

Blst  iteration 

op3  opk  2nd  iteration 

/  ■■  ■  i 


fLTL 


With  Loop  pipelining  (Initiation  Interval  11=2  cycles) 


Fig.  6.  Comparison  of  MHE  solutions  by  FPGA  and  PC 


V.  Results 


A.  Illustrative  Example 

As  an  illustration,  we  use  our  designs  shown  in  Table  II  to 
solve  the  following  MHE  example: 


'0.2695 

0.4779 

0.1127 

0.1339' 

'1' 

1  — 

0.2688 

0.4786 

0.0753 

0.1713 

Xk  + 

1 

0.0053 

0.0063 

0.5713 

0.0168 

1 

0.0035 

0.0081 

0.0094 

0.5782 

_1_ 

2 Ik  = 


1  0.5  1  0.5 


Xk  T  &k 


We  assume  the  constraints  are 


'2.2' 

'4.3' 

'-1' 

'1' 

2.4 

4.2 

-1 

1 

0 

2.7 

7 

-1 

V/ 

-sa 

V/ 

1 

.  0  . 

2.3 

.-1. 

.1. 

and 


Q  =  E  (wfeWfc  )  =  0.1  x  J4,  R  =  E  (ekel)  =  5, 
xq  =  [3  3  1.3  1.3] T  ,  uk  =  0.5. 


Fig.  5.  Loop  pipelining 


The  parameters  of  MHE  (4)  are 


Po  —  h,  Ao  —  [3  3  1.3 


1.3] 7  ,  N  =  5. 


B.  Hardware  Co-simulation  Result 


TABLE  II 

Various  “MHE  on  Chip"  designs  implemented 


Design  Name 

Description 

F132 

Single  precision  floating  point  without  LPP 

F132-LPP 

Single  precision  floating  point  with  LPP 

Fi25 

25-bit  word  fixed-point  without  LPP 

Fi25-LPP 

25-bit  word  fixed-point  with  LPP 

The  MHE  solutions  obtained  from  employing  MATLAB 
quadprog(),  m-File  ADMM  and  ADMM  implemented  on 
FPGA  are  depicted  in  Figure  6  which  shows  that  the  MHE 
solutions  in  MATLAB  on  a  PC  (red-square  line  and  green 
cross  line)  and  MHE  solution  by  ADMM  method  in  FPGA 
(blue-rhombus  line  and  pink-diamond)  are  slightly  different. 
The  fixed-point  version  is  almost  the  same  as  the  floating¬ 
point  version  on  FPGA.  Figure  7  show  the  relative  errors  in 
FPGA  computation)  both  fixed-point  and  floating  point),  and 
from  this  we  see  that  the  accuracy  are  acceptable  (less  than 
2%) 
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Fig.  7.  Relative  Error  between  different  computations 


C.  Average  running  time  and  iteration  time 

Table  III  shows  the  average  running  time  per  MHE 
sampling  instant  of  different  FPGA  designs  for  different  QP 
sizes.  The  QP  size  depends  on  the  estimation  horizon  N  and 
is  equal  to  (2 N  —  1).4  (see  (13)  )  in  our  MHE  example. 
The  second  column  of  Table  III  shows  the  average  iteration 
time,  to  achieve  an  accuracy  eo  =  10-6,  at  each  sampling 
instant  of  ADMM  in  FPGA  .  The  number  of  iterations 
of  ADMM  increases  as  the  QP  Size  grows.  Moreover,  the 
number  of  iterations  of  ADMM  also  depends  on  the  numerical 
characteristic  of  a  particular  QP  problem,  e.g  the  condition 
number  of  the  Hessian  matrix. 

Computational  acceleration  of  LPP  and  fixed-point 
designs 

From  column  6  &  9  of  Table  III,  we  see  that  FPP  sped  up 
the  computation  significantly.  For  a  fixed-point  design,  FPP 
speeds  up  about  7  times,  whereas  with  a  floating  point  design, 
it  is  about  30  times  faster  with  FPP. 

Also  as  expected,  fixed-point  design  runs  faster  than  floating 
point  design.  From  two  last  column  of  Table  III,  we  can  see 
that  the  speed  up  factor  is  about  4.7  and  1.2  for  designs  with 
and  without  FPP  respectively. 

Resource  Comparison 

As  expected,  fixed-point  design  is  not  only  faster,  but  also  it 
cost  less  hardware  resources  than  floating  point  design.  Also,  a 
design  with  FPP  which  speed  up  the  computation  costs  much 
more  resource  than  a  design  without  FPP.  As  an  illustration. 
Table  IV  shows  the  resource  usage  for  the  case  when  the 
estimation  horizon  N  =  5  resulting  in  QP  Size=  36. 


TABLE  IV 

Hardware  Resource  Comparison  (QP  Size  36) 


Design 

BRAM18s 

DSP48s 

LUTs 

FFs 

Fi2b 

16 

14 

1385 

837 

Fi25  -  LPP 

16 

157 

5377 

3977 

F132 

17 

16 

1538 

3521 

F132  -  LPP 

17 

32 

8976 

8562 

Available 

280 

220 

53200 

106400 

D.  Max  QP  size 

Base  on  the  resource  used  for  some  difference  QP  Size 
problems;  we  estimate  the  largest  QP  size  that  can  be  im¬ 
plemented  on  this  FPGA  board  is  about  300.  This  is  the 
largest  solvable  QP  problem  on  an  embedded  platform  in  the 
literature.  Future  work  will  be  carried  out  to  confirm  this 
number. 

VI.  Conclusion  and  Future  Works 

In  this  paper  we  have  implemented  the  ADMM  method 
on  an  FPGA  to  solve  QP  problems  arising  from  MHE,  and 
demonstrated  the  feasibility  of  the  “MHE  on  chip”  idea.  The 
average  running  time  per  MHE  sampling  instance  by  various 
FPGA  designs  are  compared  for  different  QP  sizes.  Also  we 
compare  the  speed  and  resources  usage  of  our  MHE-on-FPGA 
implementations  with  and  without  Foop  pipelining  as  well 
as  fixed-point  and  floating  point  arithmetic.  Foop  pipelining 
technique  sped  up  significantly  the  computation  but  much  more 
resources  used.  Fixed-point  design  not  only  faster  but  also 
cheaper  than  a  floating  point  design.  A  natural  extension  of 
this  work  includes 

1)  Establish  the  generally  analytical  bound  for  fixed-point 
arithmetic  implementation,  since  the  current  bound  is 
based  on  the  simulation 

2)  Accelerate  computation  on  FPGA  by  further  exploring 
the  structure  and  property  of  ADMM  and  MHE 

3)  ADMM-based  MHE  on  FPGA  for  nonlinear  plant  and 
apply  the  ADMM  implemented  on  FPGA  for  MHE  to  a 
real  plant. 
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QP 

Size 
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Iteration 

Fixed-point(FPGA) 

Floating-point(FPGA) 

Fixed-point  speed  up  factor3 

Fi25 

(ms)1 

Fi25-LPP 

(ms)1 

Speed  up  factor2 

Fi25 

F132 

(ms)1 

F132-LPP 

(ms)1 

Speed  up  factor2 

F132 

F132 

Fi25 

F132  —  LPP 
Fi25—LPP 

Fi25  —  LPP 

F132—LPP 

5 
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25.5 
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25.82 

4.69 

1.23 
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67 
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26.9 

4.61 

1.18 

8 

60 

80 

24.2 

3.4 
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3.85 

29.3 

4.66 

1.13 

10 

76 

90 

43.2 

5.9 

7.3 

202.25 

6.62 

30.6 

4.68 

1.12 

12 

92 

98 

68.5 

9.2 

7.4 

321.49 

11.68 
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4.69 
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3  Speed  up  factor  of  fixed-point  design 
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A  Simulink  Model  for  “MHE  on  a  Chip” 


Figure  1:  FPGA-in-the-Loop  Simulation  System 
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out2_0_z0  > 


in3_y0 

in3_y1 

in3_y2 

in3_,3 

i"3_y« 

in4_muy1 

in5_muy3 

in6_efr 

in7_«0 


out3_xM0  > 

out3_xH1  > 


oul3_xH3  > 


out4_xArrive  > 


Figure  2:  FPGA  Harware  Co-Simulation  Block 


Pulse  Generator  Delayl 

Figure  3:  Control  Logic  Block 


Figure  4:  Plant  Data  Y 
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zEnable 


Figure  5:  Setting  Block 


Buffer  [5] 


Figure  6:  Buffer  Block 
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B  “MHE  on  a  Chip”  Experimental  Setup 


Figure  7:  Zynq  ZC702  APP  SoC 


Figure  8:  Hardware  Co-Simulation 
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C  C++  Code  for  “MHE  on  a  Chip” 


1 

2  C++  Code 

3  /* 

4  *  File  mhe_admm.cpp 

5  *  By  DANG  Van  Thuy  @  Control  Engineering  Lab  @  NTU,  Singapore 

6  *  May  2014 

7  *  Ver  1 . 0 

8  *  MHE  Project 

9  */ 

10 

n  finclude  "mhe_admm.h" 

12 

13  //MHE  base  on  ADMM  function  declaration 

14  int  mhe.admm (dTypel  y[N],dTypel  muyO [n] ,  dTypel  TOL [ 1 ] , dType2  ro[0],dTypel  v[N-l],  dTypel  ... 

z [ size ] , dTypel  xH[n],int  loops [ 1 ], dTypel  xArrive [n] ) 

15  //Input:  y,  muyO,  TOL  (Tolerance),  ro,  v(u) 

16  //Output:  z,  xH,  Loops  counter,  xArrive 

17  { 

is  dTypel  zK [ size ], zKl [ size ], uk [ size ] ,  ukl [ size ], xKl [ size] ,  h[size],  g [ 1 6 ] ,  ml2g[size]; 

19  dTypel  epsilon=0; 

20  int  i, j , t , loopCounter, count ; 

21  dTypel  temp; 

22  dTypel  zTemp [size] ;//, thetaTemp [size]  ; 

23  dType2  rod; 

24  dTypel  epsilon_0; 

25  int  loopCount; 

26  int  loopNum; 

27  ro_l=0.25; 

28  epsilon_0=TOL  [ 0 ]  ; 

29 

30  //Variable  Initiation 

31  MHE_admm_init : for (i=0; i<size; i++) 

32  { 

33  zK  [  i  ]  =zMin  [  i  ]  ; 

34  uk[i]=0; 

35  xKl [ i ] =zMin [i]  ; 

36  zKl [ i ] =zMin [i] ; 

37  ukl  [  i  ]  =0; 

38  h[i]=0; 

39  } ; 

40 

41  //  h  ,  g  calculation  h=-Gama 1 *Lamda 1 *Y=-GaLa* [muyO , Y] ; 

42  MHE_hCalculation : 

43  for (i=0; i<20; i++) 

44  { 

45  temp=0; 

46  for (j=0; j<4; j++) 

47  { 

48  temp=temp+GaLa [ i ] [ j ] *muy0 [ j ] ; 

49  } 

50 

51  for  (j  =  4;  j<9;  j++) 

52  { 

53  temp=temp+GaLa [ i ]  [ j ] *y [ j-4 ]  ; 

54  } 

55  h[i]=temp; 
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56 


} 


57 

58  //  g  Calculation  g=-Bba*v  =[-u0  ; -uO  ;u0  -uO . . . . 

59  admmQP_cCalculat ion :for(i=0;i<4;i++) 

60  { 

61  g  [  4*i  +  l  ]  =-v  [  i  ]  ; 

62  g [4*i+2 ] =-v [i] ; 

63  g [4*i+3 ] =-v [i]  ; 

64  g  [  4  *  i  ]  =— v  [  i  ]  ; 

65  } 

66 

67  //ml2g=M12  *g  calculation  36x16  16x1 

68  MHE_admm_M12gCalculat ion :  f or  (i=0;  Ksize;  i++) 

69  { 

70  temp=0; 

71 

72  loop_label4:for(j=0;j<16;j++) 

73  { 

74  temp=temp+M12 [ i ] [ j ] *g [ j ] ; 

75  } 

76  ml2g  [  i  ]  Kemp; 

77  } 

78 

79  /  /  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  i  ADMM  loop  -n-r^-rn-r^-rn-r^-rn-rn-r-rn-i-i-i-i-i-i-i-i-i-r-i-i-r-i-r-i-i-i-i-T-i-r-r-i-T-i 


so  loopCounter=0 ; 

81  epsilon=l; 

82 

83  ADMM_loop:do 

84  { 

85  / /xKl=Ml 1* (-h+ro* ( zK-uk) ) +M12*g 

86  Loop_l  :  for  ( i=0 ;  Ksize;  i++) 

87  { 

88  #pragma  HLS  PIPELINE  //Loop  Pipelining  Directive 

89 

90  temp=0; 

91  Loop_l .  1 : for ( j  =  0; j<size; j++) 

92  { 

93  temp=temp+Ml 1 [ i ] [ j ] * (-h [ j ] +ro [ 0 ] * ( zK [ j ] -uk [  j  ]  )  )  ; 

94  } 

95  xKl [ i ] =temp+ml2g [ i  ]  ; 

96  zTemp [ i ] =xKl [ i ] +r o_l *uk [ i ]  ; 

97  } 

98 

99  //  Projection 

100  Loop_2  :  for  ( i=0 ;  Ksize;  i++) 

101  { 

102  fpragma  HLS  PIPELINE 

103 

104  if ( zTemp [ i] <zMax [ i ] )  zKl [ i ] =zTemp [ i ] ; 

105  if  ( zTemp [ i] >zMax [ i ] )  zKl [ i ] =zMax [i] ; 

106  if  ( zKl  [  i]  <zMin  [  i  ]  )  zKl  [  i  ]  =zMin  [  i  ]  ; 

107  } 

108 

109  //ukl=xKl+uk-zKl 
no  epsilon=0; 

in  Loop_3  :  for  ( i=0 ;  Ksize;  i++) 

112  { 

113  #pragma  HLS  PIPELINE 

114 

115  ukl[i]=  xKl  [i ] +uk  [ i] -zKl  [ i]  ; 
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116 

117 

118 

119 

120 
121 
122 

123 

124 

125 

126 

127 

128 

129 

130 

131 

132 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

145 

146 

147 

148 

149 

150 

151 

152 

153 

154 

155 

156 

157 

158 

159 

160 

161 

162 

163 

164 

165 

166 

167 

168 

169 

170 

171 

172 

173 


epsilon=epsilon+ (zKl [i]-zK[i] ) * (zKl [i]-zK[i] ) ; 
zK [i ] =zKl [i] ; 
uk [i] =ukl [ i ] ; 

} 

loopCounter=loopCounter+l; 

} 

while (epsilon>epsilon_0 )  ; 

/  /  — i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i— i—i E  n  d  ADMM— i—i— i—i— i— i—i— i— i—i— i— i—i— i— i—i— i— i—i— i— i—i— i—i— i— i—i— i— i—i— i— i—i 

/ / Output 

admmQP_out Z : f or (i=0 ; i<size; i++) 

{ 

z [i] =zK [ i ] ; 

} 

loops [ 0 ] =loopCounter ; 

MHE_admm_outXhat  :  for  (i=0 ;  i<4;  i++) 

{ 

xH [ i ] =zK [ i ] ; 
xArrive [ i ] =zK [ 1 6+i ] ; 

} 

/ / ack [ 0 ] =1 ; 

Return  loops [0]; 

} 


/  * 

*  File  mhe_admm.h 

*  By  DANG  Van  Thuy  @  Control  Engineering  Lab 

*  NTU,  Singapore 

*  May  2014 

*  Ver  1.0 

*  MHE  Project 

*  / 

#ifndef  _admmQP_H 
#define  _admmQP_H 
# include" ap -fixed . h " 

#include " stdio . h" 

#include  "math.h" 

typedef  ap_f ixed<25,  6, AP_RND ,  AP_SAT>  dTypel;  //  Fixed  point  Data  Type  ,  25  bits  long, 
6  integer  bits 

typedef  ap_f ixed<18, 8, AP_RND ,  AP_SAT>  dType2;  //  Fixed  point  Data  Type  ,  18  bits  long, 
8  integer  bits 


tdefine  N  5 
#define  M  4 
#define  n  4 
fdefine  Nn  20 
tdefine  size  36 


//  QP  Size  (2*N-l)*n 


#define  wL  16 
#define  mNu  12 


//  MHE  base  on  ADMM  function 


7 

Distribution  A:  Approved  for  public  release.  Distribution  is  unlimited 


174 

175 

176 

177 

178 

179 

180 
181 
182 

183 

184 

185 

186 

187 

188 
189 


int  mhe_admm (dTypel  y[N], dTypel  muyO [n] ,  dTypel  TOL [ 1 ] , dType2  ro[0],dTypel  v[N-l],  dTypel  ... 
z [ size] , dTypel  xH[n] , int  loops [ 1 ], dTypel  xArrive[n] ) ; 

//  Mil,  M12  pre-computed  and  loaded  Mil:  36x36  Matrix,  M12:  36x16  Matrix 
const  dTypel  Mil [size] [size]= 

{ 

{0.0861026975072449, 0.0115724687727949, -0.00127297278375129, -0.000 92 544 32 08968007, 0.0199957143 
,  .  .  .,  0.0564792028859685}, 

}; 


7645,  . 


const  dTypel  M12[size] [wL]= 

{ 

{0.118608904 52 1333, 0.13640580586447, 0.0147584872209736, 0.01 92 46 62 87930242, 0.0574180881243037,. 

\\ 

.  .  . ,  0.20929115959644}, 

}; 


190 


191 

192 

193 

194 

195 

196 

197 

198 

199 

200 
201 
202 

203 

204 

205 

206 

207 

208 

209 

210 
211 
212 

213 

214 

215 

216 

217 

218 

219 

220 
221 
222 

223 

224 

225 

226 

227 

228 


const  dTypel  zMax[size]= 

{ 

4.30000000000000, 

4.20000000000000, 

2.70000000000000, 

2.30000000000000, 

4.30000000000000, 

4.20000000000000, 

2.70000000000000, 

2.30000000000000, 

4.30000000000000, 

4.20000000000000, 

2.70000000000000, 

2.30000000000000, 

4.30000000000000, 

4.20000000000000, 

2.70000000000000, 

2.30000000000000, 

4.30000000000000, 

4.20000000000000, 

2.70000000000000, 

2.30000000000000, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 

1, 


229 

230  }  ; 

231 

232  const  dTypel  zMin[size]= 

233  { 
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234 

2  . 

235 

2  . 

236 

0, 

237 

0, 

238 

2  . 

239 

2  . 

240 

0, 

241 

0, 

242 

2  . 

243 

2  . 

244 

0, 

245 

0, 

246 

2  . 

247 

2  . 

248 

0, 

249 

0, 

250 

2  . 

251 

2  . 

252 

0, 

253 

0, 

254 

-1 

255 

-1 

256 

-1 

257 

-1 

258 

-1 

259 

-1 

260 

-1 

261 

-1 

262 

-1 

263 

-1 

264 

-1 

265 

-1 

266 

-1 

267 

-1 

268 

-1 

269 

-1 

270 

271 

}  ; 

272 

273 

const  dTyp 

274 

{ 

275 

{- 

276 

{3 

277 

{o 

278 

{o 

279 

{o 

280 

{o 

281 

{o 

282 

{o 

283 

{o 

284 

{o 

285 

{o 

286 

{o 

287 

{o 

288 

{o 

289 

{0 

290 

{0 

)000000000, 

)000000000, 


)000000000, 

)000000000, 


)000000000, 

)000000000, 


)000000000, 

)000000000, 


)000000000, 

)000000000, 


dTypel  GaLa[20] [9]= 

.99897353428822  ,3.00726673635644  ,0.397510911062318 

,-0.200000000000000  ,0  ,0  ,0  ,0}, 

00726673635644  ,-6.95879175306943  ,0.190294233491864 

,-0.100000000000000  ,0  ,0  ,0  ,0}, 

397510911062318  ,0.190294233491864  ,-6.88479688368285 

,0.00358283802374717  ,-0.200000000000000  ,0  ,0  ,0 


,-6.86404367001155  ,-0.100000000000000  ,0 

0  ,0  ,0  ,0  ,-0.200000000000000  ,0  ,0 
-0.100000000000000  ,0  ,0 
-0.200000000000000  ,0  ,0 
-0.100000000000000  ,0  ,0 


,0 

,0 

,0 

,0 

,0 

,0 

,0 

,0 


,0 

,0 

,0 

,0 

,0 

,0 

,0 

,0 

,0 

,0 

,0 


,0 
,0 
,0 
,0 
,  o 
,0 
,0 
,0 
,0 
,  o 
,0 


,  0 
,  0 
,0 
,0 
,0 
,  0 
,  0 
,  0 
,0 
,0 
,0 


,  0  ,0 
,0}, 
,0}, 
,0}, 
,0}, 
,0}, 
,0}, 
,0}, 
,0}, 


,  0}, 


,0  ,-0.200000000000000  ,0 

,0  ,-0.100000000000000  ,0 

,0  ,-0.200000000000000  ,0 

,0  ,-0.100000000000000,  0 

,0  ,0  ,-0.200000000000000  ,0}, 

,0  ,0  ,-0.100000000000000  ,0}, 

,0  ,0  ,-0.200000000000000  ,0}, 

,0  ,0  ,-0.100000000000000  ,0}, 


0 . 482987654546387 
0 . 691613831146628 


0}, 
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291 

{o 

,0 

,  o 

,0 

,0 

,0 

,0 

,0 

,-0.200000000000000}, 

292 

{o 

,0 

,  o 

,0 

,0 

,0 

,0 

,0 

,-0.100000000000000}, 

293 

{0 

,0 

,0 

,0 

,0 

,  o 

,0 

,0 

,-0.200000000000000}, 

294 

{0 

,0 

,0 

,o 

,0 

,  o 

,0 

,0 

,-0.100000000000000} 

295 

296  }  ; 

297 

298  #endif 
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